Improving Syllabification Models with Phonotactic Knowledge
Abstract
We report on a series of experiments with probabilistic context-free grammars that predict English and German syllable structure. The treebank-trained grammars are evaluated on a syllabification task. The grammar used by Müller (2002) serves as a point of comparison. As she evaluates the grammar only for German, we reimplement it and experiment with additional phonotactic features. Using bigrams within the syllable, we can model the dependency on the previous consonant in the onset and coda. A 10-fold cross-validation procedure shows that syllabification can be improved by incorporating this type of phonotactic knowledge. Compared to the grammar of Müller (2002), syllable boundary accuracy increases from 95.8% to 97.2% for English, and from 95.9% to 97.2% for German. Moreover, our experiments with different syllable structures indicate that there are dependencies between the onset and the nucleus for German but not for English. The analysis of one of our phonotactic grammars shows that interesting phonotactic constraints are learned. For instance, unvoiced consonants are the most likely first consonants, and liquids and glides are preferred as second consonants in two-consonant onsets.
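The bigram idea in the abstract can be illustrated with a minimal sketch. It is not the authors' grammar: it only estimates, by relative frequency, how likely each onset consonant is given the preceding one. The function name `onset_bigram_probs`, the `START` marker, and the toy onsets are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for the position before the first onset consonant


def onset_bigram_probs(onsets):
    """Estimate P(consonant | previous consonant) by relative frequency."""
    counts = defaultdict(Counter)
    for onset in onsets:
        prev = START
        for cons in onset:
            counts[prev][cons] += 1
            prev = cons
    # Normalize each row of counts into a conditional distribution.
    return {
        prev: {c: n / sum(following.values()) for c, n in following.items()}
        for prev, following in counts.items()
    }


# Toy, made-up onsets (illustrative only, not data from the paper).
toy_onsets = [("t", "r"), ("t", "r"), ("p", "l"), ("p", "r"), ("s", "t"), ("t",)]
probs = onset_bigram_probs(toy_onsets)

print(probs[START])  # distribution over first consonants
print(probs["p"])    # distribution over second consonants after "p"
```

In a grammar-based setting, a comparable effect would come from making each onset rule depend on the previous consonant's category. The flat table here only shows the kind of statistic such rules capture.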
Similar resources
Automatic Syllabification for Manipuri language
Developing hand-crafted rules for syllabifying the words of a language is an expensive task. This paper proposes several data-driven methods for automatic syllabification of words written in the Manipuri language. Manipuri is one of the scheduled Indian languages. First, we propose a language-independent rule-based approach formulated using entropy-based phonotactic segmentation. Second, we project ...
A comparison of theoretical and human syllabification
A review of phonological syllabification theory reveals considerable controversy, with a number of conflicting theories put forward to explain this process. In this study, the performance of five French-specific syllabification procedures was compared and contrasted both against each other, using lexical analysis, and against human syllable boundary placement, using a metalinguistic syllable ...
Structural constraints in the perception of English stop-sonorant clusters.
Native-language phonemes combined in a non-native way can be misperceived so as to conform to native phonotactics, e.g. English listeners are biased to hear syllable-initial [tr] rather than the illegal [tl] (Perception and Psychophysics 34 (1983) 338; Perception and Psychophysics 60 (1998) 941). What sort of linguistic knowledge causes phonotactic perceptual bias? Two classes of models were co...
Automatic word stress marking and syllabification for Catalan TTS
Stress and syllabification are essential attributes for several components in text-to-speech (TTS) systems. They are responsible for improving grapheme-to-phoneme conversion rules and for enhancing the intelligibility of synthetic speech, since stress and syllables are key units in prosody prediction. This paper presents three linguistically rule-based automatic algorithms for Catalan text-to-speech conve...
Publication date: 2006